Skip to content

Conversation

@lkdvos
Copy link
Member

@lkdvos lkdvos commented Jun 14, 2025

This is an initial implementation for gauge fixing a PEPS based on the algorithm described here

From what I can tell it is mostly functional, although it requires some tests etc to actually see how well it performs in practice.
The idea would be to attempt to stabilize some gradient methods by re-gauging the state every N steps.

TODO:

  • Support for fermions.
  • Enforce positive semi-definiteness of message matrices.
  • Allow any virtual leg arrow directions in the PEPS.
  • Rotation (rotl90 etc.) of BPEnv.
  • Conversion from BPEnv to CTMRGEnv and SUWeight.

@codecov
Copy link

codecov bot commented Jun 14, 2025

Codecov Report

❌ Patch coverage is 90.55556% with 34 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/algorithms/contractions/bp_contractions.jl 86.13% 14 Missing ⚠️
src/environments/bp_environments.jl 87.75% 12 Missing ⚠️
src/algorithms/bp/beliefpropagation.jl 94.23% 3 Missing ⚠️
src/algorithms/time_evolution/gaugefix_su.jl 90.32% 3 Missing ⚠️
src/algorithms/bp/gaugefix.jl 96.82% 2 Missing ⚠️
Files with missing lines Coverage Δ
src/PEPSKit.jl 100.00% <ø> (ø)
src/environments/ctmrg_environments.jl 81.50% <100.00%> (+0.89%) ⬆️
src/networks/local_sandwich.jl 92.85% <100.00%> (ø)
src/utility/util.jl 72.27% <100.00%> (+1.44%) ⬆️
src/algorithms/bp/gaugefix.jl 96.82% <96.82%> (ø)
src/algorithms/bp/beliefpropagation.jl 94.23% <94.23%> (ø)
src/algorithms/time_evolution/gaugefix_su.jl 90.32% <90.32%> (ø)
src/environments/bp_environments.jl 87.75% <87.75%> (ø)
src/algorithms/contractions/bp_contractions.jl 86.13% <86.13%> (ø)

... and 3 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Yue-Zhengyuan
Copy link
Member

Yue-Zhengyuan commented Jun 15, 2025

I remember that BP is equivalent to simple update (PhysRevResearch.3.023073), although I haven't looked at the details (such as whether the bond messages are guaranteed to be real diagonal, and thus already represented by SUWeight). We may use this equivalence to avoid introducing redundant structs, or generalize the existing simple update algorithm.

BTW, YASTN people are also working on BP, and with NNN Hamiltonian support (in one of the branches) as well.

@lkdvos
Copy link
Member Author

lkdvos commented Jun 15, 2025

I looked into trying to reuse some of that functionality, and at least for the BPEnv struct there really are substantial differences: note how there are 2 messages per edge, one going in each direction, which are connected between bra and ket, instead of between the virtual edges of the PEPS. That's why I ended up modeling this more like the CTMRGEnv, where I'm effectively storing a CTMRG environment with chi = 1 trivial legs. (I'll try add a converter/promoter from BPEnv to CTMRGEnv to at some point.)

I do agree that it would be nice to join the simple update part, but for now I was actually mostly interested in affecting the gauge of a PEPS, and we can look into extending this to actually do updates in the future.

It's nice to hear that YASTN also are finding this helpful! I don't think adding NNN support would be too hard, the contractions with these rank 1 environments are a lot more manageable in general.
Something I would like to look into at some point is improving the accuracy with loop corrections, see https://arxiv.org/pdf/2409.03108

@pbrehmer
Copy link
Collaborator

Thanks for getting this started! I'd be happy to join in on this since I was recently talking to Bram about PEPS optimization, how well it might perform and how that might be related to gauge freedoms. Specifically, I was looking at optimization runs that get stuck in linesearching at some point - e.g. when choosing a CTMRG environment dimension that is too low. Around these iterations where the optimizer gets stuck I computed cuts through the energy landscape along the current gradient directions and I was getting all sorts of weird curves with cusps or discontinuities (inspired by Appendix B of the periodic PEPS paper: https://arxiv.org/pdf/2411.12731).

For example, a Heisenberg model at $D=3$ and $\chi=4$ will look like this ($i$ being the LBFGS iteration, $g$ the gradient at iteration $i$):
image

At $D=3$ and $\chi=8$ it looks a bit more wild:
image

In these cases you can really see that a low environment dimension will first get you smooth convex functions that become more closely centered around $\alpha=0$ and then cusps and discontinuities may appear. In these cases the linesearching algorithms of course run into trouble and eventually may spit out unphysical optimization steps. Also, this process of breaking down might stretch over many LBFGS iterations.

I seemed that gauging by symmetrizing the PEPS spatially did improve the energy landscapes but I haven't properly investigated that. In any case, I would be quite interested to see how BP gauging could improve optimization convergence and perhaps stabilize more complicated optimization runs.

@Yue-Zhengyuan
Copy link
Member

Something I would like to look into at some point is improving the accuracy with loop corrections, see https://arxiv.org/pdf/2409.03108

I also have wanted to try it out for quite a while, and see how it can systemically improve over SU.

To suggest some tests for the current PR, we can first check that if the bond weights produced by SU (with identity gates) are consistent with the BP message tensors.

@lkdvos
Copy link
Member Author

lkdvos commented Jun 16, 2025

@pbrehmer these are some really cool plots! This was indeed one of the primary use-cases I had in mind, so it would be cool to see how this alters these gradient plots.

As a side note, at such low bond dimensions I would also be slightly cautious about picking bond dimensions that necessarily break the symmetry, for example in a U1 case with charge conjugation I would expect that you might need to check for each D whether or not it gives a valid combination. This is motivated by similar problems for MPS simulations, where certain bond dimensions don't work because they have to cut inbetween multiplets.

@GlebFedorovich
Copy link

GlebFedorovich commented Jun 17, 2025

@pbrehmer Indeed, we have seen this a lot for periodic PEPS, and pushing $$\chi$$ and imposing some rotation and mirror symmetries help to stabilize the optimization and avoid these discontinuous points. However, there are many cases, beyond ising and heisenberg models, when imposing spatial symmetries leads to higher energies compared let's say with just SU (or with AD fixed-point one before the optimization arrives at "bad" point :) )

@pbrehmer
Copy link
Collaborator

However, there are many cases, beyond ising and heisenberg models, when imposing spatial symmetries leads to higher energies compared let's say with just SU (or with AD fixed-point one before the optimization arrives at "bad" point :) )

Yes I have also noticed that recently in some cases! Good to know that that seems to be consistent with your findings :)

Copy link
Member

@Yue-Zhengyuan Yue-Zhengyuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since adding BP may need some revisions for SU and InfiniteWeightPEPS, I want to ask a few things to get more familiar with it.

@kshyatt
Copy link
Member

kshyatt commented Nov 14, 2025

Resolved the src/PEPSKit.jl conflict if anyone wants to pick this back up

@github-actions
Copy link
Contributor

github-actions bot commented Nov 14, 2025

Your PR no longer requires formatting changes. Thank you for your contribution!

Comment on lines +97 to +99
D, V = eigh_full(A)
sqrtA = V * sdiag_pow(D, 1 / 2) * V'
isqrtA = V * sdiag_pow(D, -1 / 2) * V'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we have an unfortunate situation with fermions: because of the axis order ket (top) <- bra (bottom), the eigenvalues in the parity-odd sector are negative. So, we cannot directly apply sdiag_pow.

Copy link
Member

@Yue-Zhengyuan Yue-Zhengyuan Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even for bosons, if the BPEnv is initialized with randn instead of ones, there can also be negative eigenvalues of the messages. In more extreme cases there can be complex eigenvalues. In such (non-Hermitian) cases I guess the gauge fixing should still be done with SVD instead of eigen-decomposition.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, because of the kwarg tol::Real = eps(scalartype(s))^(3 / 4), sdiag_pow cannot handle complex DiagonalTensorMaps unless tol is explicitly provided.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, that kwarg should have been eps(real(scalartype(s)))^(3/4).

For the remainder, I indeed messed up: the property to check is positive definiteness (which implies hermitian), and hermitian by itself isn't enough.

However, for the fermions I would like to investigate this more closely. Looking at eq (124) in the fermionic tensor network methods paper, we can investigate the special case of BP where the messages are the identity (e.g. in a tree or MPS where the tensors have been properly gauged), and we see that indeed we expect negative values in the parity-odd sector, so that seems correct.
However, I think we should be able to (and probably want to!) redefine or contractions by adding this twist explicitly, which will then restore the positive definiteness of our messages.

Thinking about that a bit more, I think this is effectively what we were doing by having the other index order for the messages as well, since for these (1, 1) tensors a permutation and a twist + planar transpose should effectively be the same thing.

Concretely, what I'm suggesting here is to explicitly factorize the message into a twist + the message, thereto adding a twist in the bp update contractions, as well as in the expectation value contractions, which will then make everything positive definite again.

I think a good test case could actually be a product state PEPS which has as virtual spaces Vect[FermionParity](1 => 1).
Because of the product state structure, the messages should (up to normalization) all be the identity and positive definite, no matter the unit cell.
I think that for this case, missing twists would make odd unit cell sizes not converge, and even unit cell sizes get staggering positive and negative messages (might be wrong about this though)

Does any of this make sense, or is this starting to feel like too much? Feel free to disagree with me here, I think most of my reasoning here comes from the intuition we obtained from that paper, which I'm trying to couple back to, but alternative viewpoints are definitely also appreciated!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that kwarg should have been eps(real(scalartype(s)))^(3/4)

I have made this change. Glad that's the way to go.

Concretely, what I'm suggesting here is to explicitly factorize the message into a twist + the message, thereto adding a twist in the bp update contractions, as well as in the expectation value contractions, which will then make everything positive definite again.

This is a good idea. The twists I added may actually be implicitly doing this.

@Yue-Zhengyuan
Copy link
Member

Yue-Zhengyuan commented Jan 8, 2026

The tests now pass for all cases I can think of (bosons vs fermions, positive vs arbitrary messages, and standard vs random virtual PEPS arrows). For fermions I introduced some twists to make things work at some seemingly unexpected positions. I'll try to justify them later.

  • It might be more convenient in the long run to have the messages follow the "canonical arrow" convention, such that north and east we get top <- bot and south and west we get bot <- top

Now I use this axis order convention suggested by @lkdvos earlier, without additional adjoints.

@lkdvos
Copy link
Member Author

lkdvos commented Jan 8, 2026

Sorry I commented on the subcomments before I saw your full comment here.
I am guessing that the choice to omit the adjoints is actually exactly what is causing these twists to appear, as an adjoint would transpose without actually having to introduce a minus sign, but I'm not 100% sure about that. In any case, I would also be happy with the current choices, as that is more close to how we described the algorithms in the MPS case in the fermionic methods paper, although I still think in that case the twists are better placed in the contractions, rather than absorbed in the messages.

@lkdvos lkdvos dismissed a stale review January 8, 2026 13:49

unknown user

@lkdvos
Copy link
Member Author

lkdvos commented Jan 8, 2026

I cannot approve since this is my own PR, but I think most of the TODOs and comments have been addressed, so I would be happy to get this one merged.

@Yue-Zhengyuan
Copy link
Member

The only remaining question is whether to absorb the twists into the messages. @lkdvos do you think it's OK to address this later if needed? @kshyatt Maybe you can have a final review and approve.

@lkdvos
Copy link
Member Author

lkdvos commented Jan 8, 2026

Yes, I think this PR has lived for too long and it will be easier to review that change in a separate PR

@kshyatt
Copy link
Member

kshyatt commented Jan 8, 2026 via email

@Yue-Zhengyuan Yue-Zhengyuan merged commit 7865d5e into master Jan 8, 2026
63 checks passed
@Yue-Zhengyuan Yue-Zhengyuan deleted the bp branch January 8, 2026 23:49
dir, row, col = Tuple(I)
@assert dir == NORTH || dir == EAST
M12 = env[dir, dir == NORTH ? _prev(row, end) : row, dir == EAST ? _next(col, end) : col]
sqrtM12, isqrtM12 = sqrt_invsqrt(twist(M12, 1))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This twist is used to cancel the twist introduced because of the axis order choice ket <- bra for the north and east messages, in order to obtain a positive definite map.

X = isqrtM12 * U * sqrtΛ
invX = sqrtΛ * Vᴴ * isqrtM21
if isdual(space(sqrtM12, 1))
X, invX = twist(flip(X, 2), 1), flip(invX, 1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This twist is used to cancel the twist introduced by flip when reversing the arrow direction. It can be equally applied to invX instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Will we one day deal with sectors that cannot be flipped?)

@Yue-Zhengyuan
Copy link
Member

Yue-Zhengyuan commented Jan 9, 2026

I added some comments to justify the twists used in the gauging part. Now I feel that it is better to keep the current design, so we don't need to worry about manually putting back the twists during BP iterations or evaluation of expectation values (which can become a bit annoying IMO: now we need to add twists only in two places in the code, but otherwise there will be many more).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants